DETAILED ACTION

1. A request for continued examination under 37 CFR 1.114, including the fee set
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this 
application is eligible for continued examination under 37 CFR 1.114, and the fee set 
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action 
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/16/06 
has been entered. 

2. Claims 1-17 are pending. Claims 1, 8 and 13 have been amended. 

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 1-21 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Santhanam, U.S. Patent No. 5,704,053 in view of Wu, et al., (Wu), U.S. Patent 
Publication No. 2003/0066061 (art made of record). 

As per claim 1, Santhanam discloses a method for generating code to 
perform anticipatory prefetching for data references, (col. 3:47-49, "The current 
invention provides a new compiler for such a processor that facilitates efficient insertion
of explicit data prefetch instructions into loops within application programs"), 
comprising: 

- receiving code to be executed on a computer system; analyzing the code 
to identify data references to be prefetched (col. 3:50-51, "The compiler uses ... 
analysis (techniques) to determine data prefetching requirements"), 

- inserting prefetch instructions into a preceding basic block of the code in 
advance of the identified data references based upon code analysis (col. 3:51-53, 
"Analysis and explicit data cache prefetch instruction insertion are performed by the 
compiler", and prefetch instructions are always inserted into the code preceding the 
identified data reference to allow sufficient time for data to be prefetched), 

- wherein inserting prefetch instructions involves inserting multiple 
prefetch instructions for a given cache line (col. 6:61-62, "the system is issuing a 
redundant (prefetch) instruction (s) to the memory system to retrieve the same cache 
line"), 

- wherein inserting the prefetch instructions involves:

- attempting to calculate a stride value for a given data reference 
within a loop (col. 6:3-5, "The compiler can predict (by attempting to calculate a 
stride value) which data (reference) is needed in advance for loops that access 
array elements in a regular fashion"),

- if the stride value cannot be calculated, setting the stride value to a
default stride value (col. 14:48-49, "(if the stride cannot be calculated), then
substitute some fixed constant, C"),

- inserting a prefetch instruction to prefetch the given data reference 
for a subsequent loop iteration based on the stride value (col. 6:5-8, "The 
compiler can then insert prefetch instructions into loops such that array elements 
that are likely to be needed in future loop iterations are retrieved from memory 
ahead of time"), 

- wherein the stride value is constant for some (but not necessarily all) loop 
iterations (col. 2:25-28, "because the analysis is done at the source code level, it is
difficult to estimate the prefetch iteration distance (PFID) (in this situation the stride 
value is constant for some but not necessarily all loop iterations), i.e. the PFID used is 
always one loop iteration (i.e. default prefetch distance)"). 
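
For illustration only, and without characterizing the actual code of either reference, the default-stride fallback and the next-iteration prefetch limitations addressed above can be sketched as follows. The names resolve_stride, next_iteration_prefetch and DEFAULT_STRIDE, and the 64-byte default value, are hypothetical; they do not appear in Santhanam or in the claims.

```python
from typing import Optional

# Hypothetical default stride in bytes; an assumption made for this sketch.
DEFAULT_STRIDE = 64


def resolve_stride(computed_stride: Optional[int],
                   default_stride: int = DEFAULT_STRIDE) -> int:
    """Return the computed stride, or the default when none could be computed.

    `computed_stride` is None when the analysis could not determine a stride.
    """
    if computed_stride is None:
        return default_stride
    return computed_stride


def next_iteration_prefetch(address: int, stride: int,
                            iterations_ahead: int = 1) -> int:
    """Address to prefetch for a loop iteration `iterations_ahead` in the future."""
    return address + stride * iterations_ahead
```

Under these assumptions, a reference at address 1000 whose stride could not be calculated would be prefetched at next_iteration_prefetch(1000, resolve_stride(None)), i.e. one default stride ahead.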

Santhanam does not explicitly disclose:

- calculating a prefetch ahead distance, wherein the prefetch ahead 
distance includes the ratio of outstanding prefetches to the number of prefetch 
streams, and 

- optimizing code based upon the prefetch ahead distance. 

However, Wu, in an analogous environment, discloses:

- calculating a prefetch ahead distance, wherein the prefetch ahead
distance includes the ratio of outstanding prefetches to the number of prefetch 
streams (para. 82:5-10, "If the frequency ratio of load instruction (i.e. the prefetch 
ahead distance) exceeds a predefined threshold, ...(the) load instruction can be 
(optimized)"), 

- optimizing code based upon the prefetch ahead distance (para. 82:1-2, "FIG. 
8 illustrates the transformation of program code (i.e. optimization) based on load value 
specialization (i.e. a prefetch ahead distance ratio)"). 
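
For illustration only, one way in which a ratio of outstanding prefetches to prefetch streams could enter a prefetch ahead distance is as an upper bound. The claim language recites only that the distance "includes" the ratio, so the capping rule below is an assumption made for this sketch, and the names ahead_distance_cap and bounded_ahead_distance are hypothetical; neither is taken from Wu, Santhanam or the specification.

```python
def ahead_distance_cap(max_outstanding_prefetches: int,
                       num_prefetch_streams: int) -> int:
    """Ratio of outstanding prefetches to prefetch streams, floored at 1.

    Assumption: each stream receives an equal share of the prefetches the
    memory system can keep in flight.
    """
    if num_prefetch_streams <= 0:
        raise ValueError("num_prefetch_streams must be positive")
    return max(1, max_outstanding_prefetches // num_prefetch_streams)


def bounded_ahead_distance(desired_distance: int,
                           max_outstanding_prefetches: int,
                           num_prefetch_streams: int) -> int:
    """Limit a desired ahead distance by the per-stream ratio (assumed rule)."""
    cap = ahead_distance_cap(max_outstanding_prefetches, num_prefetch_streams)
    return min(desired_distance, cap)
```

With 8 outstanding prefetches shared by 4 streams, a desired distance of 5 would be limited to 2 under this assumed rule.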

Therefore, it would have been obvious to a person of ordinary skill in the art, at
the time the invention was made, to incorporate the teachings of Wu into the system of
Santhanam to optimize code based on a prefetch ratio. The modification
would have been obvious because one of ordinary skill in the art would have wanted to
use well-known and well-documented code analysis and optimization techniques to increase
the effectiveness of computer systems that utilize prefetch instructions.

As per claim 2, the rejection of claim 1 is incorporated and further, Santhanam 
discloses allowing a system user to specify the default stride value (col. 13:39, 
"Estimating the average loop iteration latency"). 

As per claim 3, the rejection of claim 1 is incorporated and further, Santhanam 
discloses that calculating the stride value involves:

- identifying an induction variable for the stride value (col. 11:23, "Identify
simple basic loop induction variables"),

- identifying a stride function for the stride value and calculating the stride 
value based upon the stride function and the induction variable (col. 17:54-60, "a 
net loop increment of eight, and the element size of "A" is 8-bytes, this is a large stride 
equivalence class, assuming a 32-byte cache line size (8×8 bytes=64 bytes)>32
bytes"). 
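
The arithmetic in the quoted passage (a net loop increment of eight, an 8-byte element size and a 32-byte cache line) can be reproduced with the short sketch below. The function names byte_stride and is_large_stride are hypothetical and are used here only to restate the quoted calculation; they are not Santhanam's.

```python
def byte_stride(net_loop_increment: int, element_size_bytes: int) -> int:
    """Stride in bytes: induction-variable increment times element size."""
    return net_loop_increment * element_size_bytes


def is_large_stride(stride_bytes: int, cache_line_bytes: int) -> bool:
    """A stride is 'large' when it exceeds the cache line size."""
    return stride_bytes > cache_line_bytes


# Values from the quoted passage: increment 8, 8-byte elements, 32-byte line.
stride = byte_stride(8, 8)
large = is_large_stride(stride, 32)
```

Here stride evaluates to 64 bytes, which exceeds the 32-byte cache line, so large is True, consistent with the "large stride equivalence class" in the quotation.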

As per claim 4, the rejection of claim 1 is incorporated and further, Santhanam 
discloses that inserting the prefetch instruction based on the stride value involves: 

- calculating a prefetch cover distance by dividing a cache line size by the 
stride value (col. 15:64-67, "When the memory stride is <=cache line size, B(i) is 
considered to be in the same cluster as B(i+1), and therefore omitted for prefetch 
consideration (i.e. the prefetch cover distance is calculated based on the cache line size 
and stride value )", and col. 17:54-66, "(Because the loop has) a net loop increment of 
eight, and the element size of "A" is 8-bytes, this is a large stride equivalence class, 
assuming a 32-byte cache line size (8×8 bytes=64 bytes)>32 bytes. All eight
references to "A" are placed into the same cluster because they exhibit group spatial
locality, and no group temporal locality. The cluster leader is the reference to A[i+7], and
the span of the cluster is 64-bytes (i.e. &A[i+7]-&A[i]). If the prefetch memory distance
was computed earlier to be 128-bytes, i.e. corresponding to a prefetch iteration distance
of two, it is only necessary, to insert three prefetch instructions to account for the entire
span of this 8-member cluster."),

- calculating a prefetch ahead distance as a function of a prefetch latency, 
the prefetch cover distance and an execution time of a loop (col. 7:11-18, "The
memory address is determined based on the number of loop iterations in advance (i.e. 
the prefetch iteration distance or PFID) that data items need to be prefetched to fully 
hide the time required to service potential data cache misses. The PFID is determined 
taking into account the nature of the loop body instructions (i.e. execution time of the 
loop and the prefetch cover distance) and characteristics of the target processor and 
memory system (i.e. the prefetch latency and prefetch cover distance)"), 

- calculating a prefetch address by multiplying the stride value by the 
prefetch cover distance and the prefetch ahead distance and adding the result to 
an address accessed by the given data reference (col. 7:11-18, "The memory 
address is determined based on the number of loop iterations in advance (i.e. the 
prefetch iteration distance or PFID) that data items need to be prefetched to fully hide 
the time required to service potential data cache misses. The PFID is determined taking 
into account the nature of the loop body instructions and characteristics of the target 
processor and memory system."). 
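
For illustration only, the three calculations recited in claim 4 can be sketched as follows. The cover distance and the prefetch address follow the claim language directly. The claim states only that the ahead distance is "a function of" the prefetch latency, the cover distance and the loop execution time, so the ceiling formula used below is an assumption, as are all function names and the sample values.

```python
import math


def prefetch_cover_distance(cache_line_size: int, stride: int) -> int:
    """Cache line size divided by the stride (iterations covered by one line).

    Flooring the result at 1 for strides larger than a line is an assumption.
    """
    if stride <= 0:
        raise ValueError("stride must be positive")
    return max(1, cache_line_size // stride)


def prefetch_ahead_distance(prefetch_latency: int, cover_distance: int,
                            loop_time: int) -> int:
    """Assumed form: ceil(latency / (cover distance * time per iteration))."""
    return math.ceil(prefetch_latency / (cover_distance * loop_time))


def prefetch_address(address: int, stride: int, cover_distance: int,
                     ahead_distance: int) -> int:
    """stride * cover distance * ahead distance, added to the reference address."""
    return address + stride * cover_distance * ahead_distance


# Hypothetical values: 64-byte line, 8-byte stride, 200-cycle latency,
# 5 cycles per loop iteration, reference at address 0x1000.
cover = prefetch_cover_distance(64, 8)
ahead = prefetch_ahead_distance(200, cover, 5)
target = prefetch_address(0x1000, 8, cover, ahead)
```

With these values the cover distance is 8 iterations, the ahead distance is 5, and the prefetch address is 0x1000 + 320. When the stride divides the cache line size evenly, the product of stride and cover distance equals the line size, so the ahead distance is in effect counted in cache lines.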

As per claim 5, the rejection of claim 1 is incorporated and further, Santhanam 
discloses that analyzing the code involves:

- identifying loop bodies within the code; identifying data references to be
prefetched from within the loop bodies (col. 8:30-35, "One important feature of the 
invention identifies loops and access patterns to allow a determination of how many 
cycles are devoted to loop iterations, and therefore allows insertion of the prefetch 
instruction to a location of an array that is sufficiently far in advance to make sure that 
the miss time is minimized."). 

As per claim 6, the rejection of claim 5 is incorporated and further, Santhanam 
discloses that analyzing the code to identify data references to be prefetched involves 
examining a pattern of data references over multiple loop iterations (col. 14:6-10, 
"Now, it is also necessary to address the issue of loops that have internal branches. The 
minimum loop iteration latency for such loops is estimated by using previously collected 
execution profile information, which indicates the execution count for each basic block in 
the loop body."). 

As per claim 7, the rejection of claim 1 is incorporated and further, Santhanam 
discloses that analyzing the code involves analyzing the code within a compiler (col. 
3:47-49, "The current invention provides a new compiler for such a processor that 
facilitates efficient insertion of explicit data prefetch instructions into loops within 
application programs").

As per claims 8-12, these claims are a computer readable medium/product version of the
claimed method discussed above in claims 1-7, wherein all claimed limitations have
also been addressed and/or cited as set forth above. For example, see Santhanam,
col. 3:47-49 and Wu, para. 25:1-26:11 and 82:1-10.

As per claims 13-17, these claims are an apparatus version of the claimed method
discussed above in claims 1-7, wherein all claimed limitations have also been
addressed and/or cited as set forth above. For example, see Santhanam, Fig. 1, item 10,
"computer architecture" and associated text, and Wu, para. 25:1-26:11 and
82:1-10.

Response to Arguments 

5. Applicant's arguments have been considered but are not persuasive.

In the remarks, the applicant has argued substantially that: 

1) Santhanam does not disclose inserting prefetch instructions based on the ratio of 
outstanding prefetches and the number of prefetch streams, at p. 9:9-11.

Examiner's response: 

1) Applicant's arguments with respect to claims 1, 8 and 13 have been considered
but are moot in view of the new ground(s) of rejection. The Santhanam/Wu combination
discloses inserting prefetch instructions based on the ratio of outstanding prefetches
and the number of prefetch streams, at Santhanam, col. 3:47-49 and Wu, para. 25:1-26:11
and 82:1-10. Additionally, while the terms "outstanding prefetch" and "prefetch
stream" are mentioned in the specification (p. 15:21-23), the examiner suggests that the
applicant include definitions of these terms to aid in future prosecution and
provide a clear record.



6. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Andre R. Fowlkes whose telephone number is (571) 
272-3697. The examiner can normally be reached on Monday - Friday, 8:00am- 
4:30pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Tuan Q. Dam can be reached on (571)272-3695. The fax phone number for 
the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free).



Conclusion 




TUAN DAM 
SUPERVISORY PATENT EXAMINER 



